Use of template messages to optimize a software messaging system

ABSTRACT

A method uses template messages to optimize a software messaging system. A message is decomposed into a template message portion containing message content, and a field message portion. A correlation identifier identifies a template message and only those template messages with unique correlation identifiers are stored or forwarded. A field message portion includes a correlation identifier associated with a template message. A recomposition function combines a field message portion with the appropriate template message portion as identified in the field message portion.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. Ser. No. 11/947,639, filed Nov. 29, 2007, the entire contents of which are incorporated herein by reference.

FIELD OF THE INVENTION

The present application relates to distributed applications and more particularly to using template messages to optimize messaging.

BACKGROUND OF THE INVENTION

Distributed software applications often interact across computer networks via asynchronous messaging protocols. Typically, those distributed messaging applications are designed such that the sender and receiver parts of the applications agree on the content and format of the messages to be exchanged. In message passing, it is often the case that a series of messages will be sent, only to have each message differ from the rest by a few key fields. That is, the bulk of the message remains the same across all messages in the series. In such a case, sending the entire message each time is clearly inefficient. Therefore, what is desirable is a method that efficiently handles messages in such situations.

BRIEF SUMMARY OF THE INVENTION

A method of optimizing a software messaging system is provided. The method in one aspect may comprise autonomously detecting patterns of repeated data in a plurality of messages; generating a plurality of template messages, each of said plurality of template messages containing a different pattern of repeated data detected in said plurality of messages; assigning a correlation identifier to said each template message. Said generating may further include replacing a template message with one determined to have more of commonly repeated data. The method may also include storing said each template message identified by a corresponding correlation identifier.

The method may further comprise, for each message being communicated, extracting dynamic portion of said each message and generating a field message to contain the dynamic portion; selecting a template message from said plurality of template messages having message content of said each message; associating a correlation identifier identifying said selected template message with said field message; and communicating said field message to a recipient application.

The method may further comprise recomposing said each message using said field message and said correlation identifier. The step of recomposing may further include searching recipient application's cache storing a plurality of template messages to select a template message identified by said correlation identifier.

The method may also comprise notifying said recipient application when there is a change in one or more of the template messages cached in said recipient application's cache.

Further features as well as the structure and operation of various embodiments are described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating message decomposition and recomposition of the present disclosure.

DETAILED DESCRIPTION

A method for sending smaller messages in which a series of messages largely contain the same content is provided. Using smaller sized messages to communicate exactly the same information as the larger sized predecessor, albeit with an additional indirection, can have a major impact on the achievable message throughput rate and also on the cost of running the application, since network usage is often charged by volume transferred. In addition to the performance gains in terms of bandwidth reduction due to smaller size, there is an additional gain, for free, due to the underlying caching mechanism of typical messaging systems. Messaging systems typically have a memory cache of limited size, beyond which messages must be spilled onto the disk. Accessing disk storage is orders of magnitude slower than direct memory access. By restricting the size of the messages flowing around the system, the cache usage is optimized as well as the network usage.

A method of the present disclosure in one embodiment enables applications to exchange messages in two parts; a set of “template messages”, which contain the bulk of the data to be exchanged, which only change content infrequently, or which need to be centrally administered for consistency of formats between applications; and the series of “field messages”, which contain the subset of message data which changes for every message. An existing messaging system or software may be augmented to provide such functionality. The scope of usage of the template messages, or the realm of applicability of the template messages, may be the entire messaging system or a logical group of messaging component(s), and is not limited to a single or a set of queue(s), topic(s) or application(s).

FIG. 1 is a block diagram illustrating message decomposition and recomposition of the present disclosure. A message 102 may be decomposed into two parts, a template message 104 containing the content portion of the message and a field message 106 containing information or attributes related to the message. Template messages may be stored at 108 and field messages may be stored at 110. The storage 108, 110 may be any one or combination of a database, a queue, or cache memory, or the like, for instance, depending on the specific implementation of the method. A decomposed message may be recomposed by combining a template message retrieved from the template message store 112 and field message 114 retrieved from the store 110. A template message is identified in a field message by a unique identifier.

Consider a series of messages describing a sale of items by auction. The main message may be boilerplate material describing the nature of the sale, the obligations of the seller and buyer, etc., and the only part of the message that changes in each message may be item name, item number, brief description, reserve price, sale price, seller id, buyer id. In this example, the boilerplate material may form a template message; and the item name, item number, brief description, reserve price, sale price, seller id, buyer id may form the field message.

As another example, consider a series of messages describing training achievements by employees. The main body of each message may include the course name, description, pre-requisites, etc., which can be designated as a template message. Unique or different parts of the message such as the trainee name, date taken, examination score, etc., would be placed in the field message.

In one embodiment a sending application may send its set of template messages to a known queue or repository on the messaging system. A set contains one or many. Each template message includes a unique correlation identifier (id). Sending and receiving applications may use an existing pre-defined repository of template messages. The repository of template messages may be shared among applications or global to all applications. An administrator or an application with sufficient authority may add, remove or modify the template messages directly. In order to send the main sequence of messages to the recipient application, the sending application sends field messages 106, each with a correlation id to match the relevant template message 104. The main message flows around the system are therefore the field messages, which are typically small, thus providing the required optimizations.

In order to rebuild the complete messages, the system browses the queue (or repository or cache memory or the like) of template messages to pick up the template messages it needs non-destructively, as well as destructively getting the field messages containing the dynamic data. By examining the correlation id of the field messages, the appropriate template message is looked up, and a known or will be known algorithm is used to insert the contents of the field messages 114 into the specified template message 112 in order to build the complete messages 116 for the recipient to use. The method of the present disclosure may be implemented within, or making use of, a current messaging middleware technology, such as IBM's WebSphere MQ™ product.

As an example, the following algorithm or method may be used to combine field messages with template messages. A sending application decomposes its text-based messages according to a set of named fields listed in the appropriate template message. For each named field, its value is removed from the original message and added to a field message as a name/value pair, i.e., field name and value corresponding to that field. A template message may contain named tags or tokens, typically with special delimiters such as < and >, in the places where the fields would be in the full messages. In recomposing at the destination, the template message is parsed to find the token delimiters, the token name between each pair is read, and the token is replaced with the value from the field message matching the token name.

For example,

Original message may include:

...content... The item Rocking Chair was sold for $405 by Mrs. A Smith. ...content...

Template message for the above original message may be:

CorrelId=AuctionSale ...content... The item <ItemName> was sold for $<Price> by <Seller>. ...content...

Field message corresponding to the original message then may be:

CorrelId=AuctionSale ItemName=Rocking Chair Price=405 Seller=Mrs. A Smith
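The token-based decomposition and recomposition described above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the function names `decompose` and `recompose`, the use of a Python dictionary as the template store, and the assumptions that token names are identifier-like, appear once per template, and that the literal text around each token is enough to delimit its value, are all introduced here for the example.

```python
import re

# Hypothetical token syntax: an identifier-like name between < and >.
TOKEN = re.compile(r"<([A-Za-z_]\w*)>")


def decompose(message, correl_id, templates):
    """Build a field message (name/value pairs) from a full text message.

    `templates` maps a correlation id to template text containing tokens.
    Assumes each token name occurs only once in the template.
    """
    template = templates[correl_id]
    parts, pos = [], 0
    for m in TOKEN.finditer(template):
        # Literal template text must match exactly; each token captures a value.
        parts.append(re.escape(template[pos:m.start()]))
        parts.append(f"(?P<{m.group(1)}>.+?)")
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    match = re.fullmatch("".join(parts), message, flags=re.DOTALL)
    if match is None:
        raise ValueError("message does not fit the template")
    return {"CorrelId": correl_id, **match.groupdict()}


def recompose(field_message, templates):
    """Substitute each token in the identified template with its field value."""
    template = templates[field_message["CorrelId"]]
    return TOKEN.sub(lambda m: field_message[m.group(1)], template)


if __name__ == "__main__":
    templates = {
        "AuctionSale": "The item <ItemName> was sold for $<Price> by <Seller>."
    }
    original = "The item Rocking Chair was sold for $405 by Mrs. A Smith."
    field_message = decompose(original, "AuctionSale", templates)
    print(field_message)
    print(recompose(field_message, templates))
```

In this sketch only the small dictionary returned by `decompose` would need to travel with each message; the template text is looked up by correlation id at the destination.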

In another example, a sending application decomposes its data structured messages by comparing a sequence of similar messages to identify those data fields that differ between messages. Those fields that are common to all messages in the sequence are entered into a template message, with named placeholders, typically with a token indicating their location, for the differing fields. These differing fields are put into field messages. The recomposing application replaces the placeholders in a template message with the equivalent fields from the current field message, either by name or simply by sequence order.

For example,

Original message may include:

Inventing For Beginners This course instructs.... ... Andrea Smith 27 Oct 1998 Pass This qualification... ...

Template message for the original message may be:

CorrelId=1295 Inventing For Beginners This course instructs.... ... >Trainee >Date >P/F This qualification... ...

Field message corresponding to the original message may then be:

CorrelId=1295 Andrea Smith 27 Oct 1998 Pass
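Recomposition by sequence order, as in the training example above, might look like the following sketch. The placeholder syntax (a `>` immediately followed by non-space characters), the function name `recompose_by_order`, and the representation of the field message as an ordered list of values are assumptions made for illustration; the sketch also assumes that no other text in the template looks like a placeholder.

```python
import re

# Hypothetical placeholder syntax: '>' followed by a run of non-space characters.
PLACEHOLDER = re.compile(r">\S+")


def recompose_by_order(template, values):
    """Replace placeholders left to right with the field values, in order."""
    expected = len(PLACEHOLDER.findall(template))
    if expected != len(values):
        raise ValueError(
            f"template has {expected} placeholders but {len(values)} values were given"
        )
    remaining = iter(values)
    # Each match consumes the next value in sequence.
    return PLACEHOLDER.sub(lambda _match: next(remaining), template)


if __name__ == "__main__":
    template = (
        "Inventing For Beginners This course instructs.... ... "
        ">Trainee >Date >P/F This qualification... ..."
    )
    print(recompose_by_order(template, ["Andrea Smith", "27 Oct 1998", "Pass"]))
```

Replacing by order keeps the field message as small as possible, since no field names are sent, at the cost of requiring sender and receiver to agree on the order of the placeholders.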

In one embodiment, template messages and field messages are stored on the messaging system's queues, and the receiving application retrieves both the template messages (non-destructively) and the field messages (destructively), and performs the message composition within the bounds of the receiving application.
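The difference between non-destructive and destructive retrieval can be illustrated with two in-memory queues. This is only a sketch under stated assumptions: `collections.deque` stands in for the messaging system's queues, and the function names and the dictionary layout of the messages are hypothetical, not part of any real messaging product's API.

```python
from collections import deque


def browse_template(template_queue, correl_id):
    """Non-destructive read: scan the queue and leave every template in place."""
    for template in template_queue:
        if template["CorrelId"] == correl_id:
            return template
    return None


def get_field_message(field_queue):
    """Destructive read: remove and return the oldest field message, if any."""
    return field_queue.popleft() if field_queue else None


def receive_next(template_queue, field_queue):
    """Pair the next field message with the template it refers to."""
    field = get_field_message(field_queue)
    if field is None:
        return None
    template = browse_template(template_queue, field["CorrelId"])
    if template is None:
        # In this simple sketch the field message has already been consumed.
        raise LookupError(f"no template with CorrelId {field['CorrelId']!r}")
    return template, field


if __name__ == "__main__":
    templates = deque([{"CorrelId": "1295", "body": ">Trainee >Date >P/F"}])
    fields = deque([{"CorrelId": "1295", "values": ["Andrea Smith", "27 Oct 1998", "Pass"]}])
    print(receive_next(templates, fields))
    print(len(templates), len(fields))
```

After one call the template queue is unchanged while the field queue has shrunk by one message, which is what allows a single template to serve many field messages.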

In another embodiment, the receiving application may cache the template messages as it gets them. An agent or a process may be deployed to notify the receiving application when template messages change. Upon being notified, the receiving application can clear the template messages from its cache. Rather than having a separate agent or process notify the receiving application of the changes in the template messages, a sender application may send an additional notification message via the messaging system whenever a template message is changed. This scheme significantly improves network bandwidth usage between queues and receiver, and is particularly efficient when template messages change very infrequently.

Yet in another embodiment, the messaging system may perform the message composition on behalf of the receiving application. In this embodiment, assuming that the messaging system performs the composition on the same computer as the receiving application (i.e., within the receiving client portion of the distributed messaging system), the network bandwidth is similar to the above embodiment. An advantage of this embodiment is that the complexity is removed from the specific application and provided within the generic messaging system. In one embodiment, the algorithm(s) for composing field messages into their template messages are generic so they can be built into the messaging system without knowledge of the specific formats required by applications.

Still yet in another embodiment, a messaging system composes the messages on the messaging server (i.e., on the computer storing the queues of messages). This has the advantage of keeping the complexity on the server system, allowing centralization of the main messaging processing, and keeping the client software running with the applications as simple and small as possible.

Another embodiment may have the messaging system perform both message decomposition from full messages sent by a sending application into template messages and field messages, and message recomposition on behalf of the receiving application. In this embodiment, the sending and receiving applications are relatively unaware of the decomposition and recomposition process, in that they send and receive full messages. However, the sending application indicates which templates and/or algorithms the messaging system should use for transmission of its messages. In this embodiment, the method becomes largely an internal optimization process within the messaging system to enable applications to minimize the network bandwidth and storage used by the messages. The caching and template message update methodology described above may be applied within this embodiment.

In another embodiment, a messaging system exercises autonomous selection of decomposition algorithms. As a series of messages is sent through the system, patterns of repeated data are detected by the system. The repeated data is then extracted into template messages, and subsequent messages following the same patterns have their dynamic data extracted into field messages for forwarding, associated with the relevant template messages for recomposition at the receiving end before delivery to the receiving application.

Pattern recognition may be as simple as monitoring a sequence of messages flowing from a sending application and comparing their contents, either character by character or, in a structured message, field by field, and may be used for autonomous selection of decomposition algorithms. As each message is examined, a template message is built up to contain those elements of the messages that are completely common across all messages. During this phase of operation, messages may be sent complete rather than using templates and fields. Once some n messages, or a predetermined number of messages, are detected with a threshold of m % of their content completely common, then that template message is brought into full use. Any subsequent messages also completely matching that template will have their non-common content extracted into field messages and sent in that form for recomposition with the template message using algorithms such as those described above. During pattern recognition, multiple template messages may be built to match different patterns. An algorithm can be tuned using various thresholds to distinguish between different patterns/templates, to determine when to start and stop using templates, when to replace a template with one with even more common content, etc.
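A field-by-field version of this pattern recognition can be sketched as follows. The sketch assumes structured messages are tuples with the same number of fields, measures "content in common" as the share of field positions that are identical across all observed messages, and uses hypothetical names (`VARIES`, `build_template`, `template_ready`, `extract_fields`); the second trainee record in the demo is made-up illustrative data. It handles a single pattern only and does not attempt template replacement.

```python
# Sentinel marking a field position whose value differs between messages.
VARIES = object()


def build_template(messages):
    """Keep fields identical across all messages; mark the rest as VARIES."""
    width = len(messages[0])
    if any(len(msg) != width for msg in messages):
        raise ValueError("all messages must have the same number of fields")
    return [
        value if all(msg[i] == value for msg in messages) else VARIES
        for i, value in enumerate(messages[0])
    ]


def template_ready(messages, n, m_percent):
    """True once at least n messages share at least m_percent of their fields."""
    if len(messages) < n:
        return False
    template = build_template(messages)
    common = sum(1 for field in template if field is not VARIES)
    # Integer comparison avoids floating-point rounding at the threshold.
    return common * 100 >= m_percent * len(template)


def extract_fields(message, template):
    """Return the dynamic values in order, or None if the message does not match."""
    if len(message) != len(template):
        return None
    for expected, actual in zip(template, message):
        if expected is not VARIES and expected != actual:
            return None
    return [actual for expected, actual in zip(template, message) if expected is VARIES]


if __name__ == "__main__":
    seen = [
        ("Inventing For Beginners", "This course instructs", "Andrea Smith", "27 Oct 1998", "Pass"),
        ("Inventing For Beginners", "This course instructs", "B. Example", "3 Nov 1998", "Fail"),
    ]
    if template_ready(seen, n=2, m_percent=40):
        template = build_template(seen)
        print(extract_fields(seen[0], template))
```

Until `template_ready` returns true, messages would be sent complete; afterwards only the list returned by `extract_fields`, together with a correlation id, would need to be forwarded for messages that match.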

In one embodiment, template cache management may be implemented by keeping the master copy of all template messages on a queue. The receiving application, when it gets a field message to recompose, looks for the template message with a matching correlation id in its local cache. If it is found, then that template message is used for recomposition. If not, then the template message queue is browsed to find that template message by correlation id, and a copy is placed in the local cache for future use, then recomposition continues.

If the sending application (or some other template management system) needs to change a template message, then it also sends a notification message to those applications using the template message queue. That notification will identify, by correlation id, the template message that has been changed, and the receiving applications will simply remove that template message from their local caches. Next time that template message is required, the changed copy will be retrieved from the queue since there will not be a cached copy.
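The cache lookup and invalidation behaviour described in the two preceding paragraphs can be sketched as a small class. The class name `TemplateCache`, its method names, and the use of a dictionary to stand in for the master template queue are assumptions for illustration; the `master_lookups` counter exists only to make the effect of caching visible.

```python
class TemplateCache:
    """Local cache of template messages keyed by correlation id."""

    def __init__(self, master):
        # `master` stands in for the queue holding the master copies.
        self.master = master
        self.local = {}
        self.master_lookups = 0

    def get(self, correl_id):
        """Return the template, browsing the master copy only on a cache miss."""
        if correl_id not in self.local:
            self.master_lookups += 1
            self.local[correl_id] = self.master[correl_id]
        return self.local[correl_id]

    def on_template_changed(self, correl_id):
        """Handle a change notification by dropping the cached copy."""
        self.local.pop(correl_id, None)


if __name__ == "__main__":
    master = {"AuctionSale": "The item <ItemName> was sold for $<Price>."}
    cache = TemplateCache(master)
    cache.get("AuctionSale")
    cache.get("AuctionSale")          # served from the local cache
    master["AuctionSale"] = "Sold: <ItemName> for $<Price>."
    cache.on_template_changed("AuctionSale")
    print(cache.get("AuctionSale"), cache.master_lookups)
```

Note that in this sketch a receiver that misses the notification would keep using the stale cached template, which is why the notification step matters.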

The system and method of the present disclosure may be implemented and run on a general-purpose computer or computer system. The computer system may be any type of known or will be known systems and may typically include a processor, memory device, a storage device, input/output devices, internal buses, and/or a communications interface for communicating with other computer systems in conjunction with communication hardware and software, etc.

The terms “computer system” and “computer network” as may be used in the present application may include a variety of combinations of fixed and/or portable computer hardware, software, peripherals, and storage devices. The computer system may include a plurality of individual components that are networked or otherwise linked to perform collaboratively, or may include one or more stand-alone components. The hardware and software components of the computer system of the present application may include and may be included within fixed and portable devices such as desktop, laptop, and server. A module may be a component of a device, software, program, or system that implements some “functionality”, which can be embodied as software, hardware, firmware, electronic circuitry, etc.

The embodiments described above are illustrative examples and it should not be construed that the present invention is limited to these particular embodiments. Thus, various changes and modifications may be effected by one skilled in the art without departing from the spirit or scope of the invention as defined in the appended claims.

1. A computer system for optimizing a software messaging system, comprising: a computer-implemented module operable to autonomously detect patterns of repeated data in a plurality of messages and generate a plurality of template messages, each of said plurality of template messages containing a different pattern of repeated data detected in said plurality of messages, the computer-implemented module further operable to replace a template message with one determined to have more of commonly repeated data, said computer-implemented module further operable to assign a correlation identifier to said each template message; a computer storage device operable to store said each template message with a corresponding correlation identifier; a message decomposition module operable, for each message being communicated, to extract dynamic portion of said each message and generate a field message to contain the dynamic portion, the message decomposition module further operable to select a template message from said plurality of template messages having message content of said each message, said message decomposition module further operable to determine a correlation identifier assigned to said selected template message, and communicate said field message to a recipient application with the correlation identifier that matches the selected template message without communicating content of said selected template message; a message recomposition module operable to receive said field message with the correlation identifier and to recompose using said field message and said correlation identifier, said message recomposition module further operable to search recipient application's cache storing a plurality of template messages to select a template message having said correlation identifier, said message recomposition module further operable to non-destructively retrieve said template message having said correlation identifier and destructively retrieve said field message, said message recomposition module further operable to replace placeholders in said template message with equivalent fields from said field message by sequence order; and an agent process deployed to notify said recipient application when there is a change in one or more of the template messages cached in said recipient application's cache, said recipient application clearing said one or more of the template messages that have changed from its cache upon being notified, wherein the computer-implemented module autonomously detects patterns, generates a plurality of template messages, assigns a correlation identifier, and stores said each template message with a corresponding correlation identifier as said each message is being communicated.